reference class
A Factorized Probabilistic Model of the Semantics of Vague Temporal Adverbials Relative to Different Event Types
Kenneweg, Svenja, Deigmöller, Jörg, Eggert, Julian, Cimiano, Philipp
Vague temporal adverbials, such as "recently," "just," and "long time ago," describe the temporal distance between a past event and the utterance time but leave the exact duration underspecified. In this paper, we introduce a factorized model that captures the semantics of these adverbials as probability distributions. These distributions are composed with event-specific distributions to yield a contextualized meaning for an adverbial applied to a specific event. We fit the model's parameters using existing data capturing judgements of native speakers regarding the applicability of these vague temporal adverbials to events that took place a given time ago. Comparing our approach to a non-factorized model based on a single Gaussian distribution for each pair of event and temporal adverbial, we find that, while both models have similar predictive power, ours is preferable in terms of Occam's razor: it is simpler and more easily extended.
- Europe > Germany (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Gulf of Mexico > Central GOM (0.04)
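The factorized composition described in the abstract above can be sketched in a few lines. The Gaussian-over-log-time shapes, the parameter values, and the product-of-experts composition rule below are illustrative assumptions, not the paper's fitted model.

```python
import numpy as np

# Time axis in days since the event, log-spaced from hours to years.
t = np.logspace(-1, 3, 500)

def bump(x, mu, sigma):
    # Unnormalized Gaussian bump, used here as an applicability curve.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Adverbial semantics: how applicable "recently" is at a log time-distance.
# (Illustrative parameters, not values fitted in the paper.)
adverbial = bump(np.log(t), mu=np.log(7), sigma=1.0)    # peaks near a week

# Event-specific distribution: typical recency for, say, buying a car.
event = bump(np.log(t), mu=np.log(365), sigma=1.5)      # peaks near a year

# Compose pointwise and renormalize (a product-of-experts composition).
contextual = adverbial * event
contextual /= contextual.sum()

# The composed meaning shifts toward the event's own time scale.
peak_day = t[np.argmax(contextual)]
```

The composed distribution lands between the adverbial's default scale (about a week) and the event's typical recency (about a year), which is the contextualization effect the abstract describes.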
Reasoning and Tools for Human-Level Forecasting
Hsieh, Elvis, Fu, Preston, Chen, Jonathan
Language models (LMs) trained on web-scale datasets are largely successful due to their ability to memorize large amounts of training data, even data present in only a few examples. These capabilities are often desirable in evaluation on tasks such as question answering but raise questions about whether these models can exhibit genuine reasoning or succeed only at mimicking patterns from the training data. This distinction is particularly salient in forecasting tasks, where the answer is not present in the training data and the model must reason to make logical deductions. We present Reasoning and Tools for Forecasting (RTF), a framework of reasoning-and-acting (ReAct) agents that can dynamically retrieve updated information and run numerical simulations with equipped tools. We evaluate our model with questions from competitive forecasting platforms and demonstrate that our method is competitive with and can outperform human predictions. This suggests that LMs, with the right tools, can indeed think and adapt like humans, offering valuable insights for real-world decision-making.
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Oceania > Australia (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
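The reasoning-and-acting loop named in the RTF abstract can be sketched as below. The `policy` function stands in for a language model with a fixed, scripted stub, and the tool names (`search`, `simulate`) are hypothetical illustrations, not RTF's actual API.

```python
def search(query):
    # Stub retrieval tool: a real agent would fetch updated web information.
    return "Base rate for similar events: 0.30"

def simulate(base_rate, n=10_000, seed=0):
    # Stub numerical-simulation tool: Monte Carlo draws around a base rate.
    import random
    rng = random.Random(seed)
    hits = sum(rng.random() < base_rate for _ in range(n))
    return hits / n

def policy(history):
    # Hypothetical LM stand-in: a fixed script of thought/action choices.
    if not any(step[0] == "observation" for step in history):
        return ("action", ("search", "event base rate"))
    if not any("simulated" in str(step) for step in history):
        return ("action", ("simulate", 0.30))
    return ("answer", 0.30)

def react_forecast(question, max_steps=5):
    # ReAct loop: alternate model decisions with tool observations
    # until the model emits a final probability forecast.
    history = [("question", question)]
    for _ in range(max_steps):
        kind, content = policy(history)
        if kind == "answer":
            return content
        tool, arg = content
        obs = search(arg) if tool == "search" else ("simulated", simulate(arg))
        history.append(("observation", obs))
    return None
```

The structure, not the stubbed content, is the point: each iteration interleaves a decision (thought/action) with a tool observation appended to the running history.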
To which reference class do you belong? Measuring racial fairness of reference classes with normative modeling
Rutherford, Saige, Wolfers, Thomas, Fraza, Charlotte, Harnett, Nathaniel G., Beckmann, Christian F., Ruhe, Henricus G., Marquand, Andre F.
Reference classes in healthcare establish healthy norms, such as pediatric growth charts of height and weight, and are used to chart deviations from these norms which represent potential clinical risk. How the demographics of the reference class influence clinical interpretation of deviations is unknown. Using normative modeling, a method for building reference classes, we evaluate the fairness (racial bias) in reference models of structural brain images that are widely used in psychiatry and neurology. We test whether including race in the model creates fairer models. We predict self-reported race using the deviation scores from three different reference class normative models, to better understand bias in an integrated, multivariate sense. Across all of these tasks, we uncover racial disparities that are not easily addressed with existing data or commonly used modeling techniques. Our work suggests that deviations from the norm could be due to demographic mismatch with the reference class, and assigning clinical meaning to these deviations should be done with caution. Our approach also suggests that acquiring more representative samples is an urgent research priority.
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (11 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
- (2 more...)
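The core normative-modeling operation in the abstract above, charting an individual's deviation from a reference-class norm, reduces to a z-score against a fitted reference model. The sketch below uses synthetic data and a simple linear norm; the variables and parameters are illustrative assumptions, not the paper's cohorts or models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic reference class: a brain measure declines linearly with age,
# plus noise. (Purely illustrative data, not the paper's cohorts.)
age_ref = rng.uniform(20, 80, 500)
vol_ref = 1200 - 2.0 * age_ref + rng.normal(0, 30, 500)

# Fit the normative (reference) model: ordinary least squares on age.
A = np.column_stack([np.ones_like(age_ref), age_ref])
coef, *_ = np.linalg.lstsq(A, vol_ref, rcond=None)
resid_sd = np.std(vol_ref - A @ coef)

def deviation_score(age, volume):
    """Z-score of an individual's measure relative to the reference norm."""
    predicted = coef[0] + coef[1] * age
    return (volume - predicted) / resid_sd

# An individual far below the age-expected norm gets a large negative score.
z = deviation_score(age=50, volume=950)
```

The paper's caution follows directly from this construction: if the individual's demographics differ from the reference sample's, a large |z| can reflect reference-class mismatch rather than clinical deviation.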
Reconciling Individual Probability Forecasts
Roth, Aaron, Tolbert, Alexander, Weinstein, Scott
Probabilistic modelling in machine learning and statistics predicts "individual probabilities" as a matter of course. In weather forecasting, we speak of the probability of rain tomorrow; in life insurance underwriting we speak of the probability that Alice will die in the next 12 months; in recidivism prediction we speak of the probability that an inmate Bob will commit a violent crime within 18 months of being released on parole; in predictive medicine we speak of the probability that Carol will develop breast cancer before the age of 50 -- and so on. But these are not repeated events: we have no way of directly measuring an "individual probability" -- and indeed, even the semantics of an individual probability are unclear and have been the subject of deep interrogation within the philosophy of science and statistics [Hájek, 2007, Dawid, 2017] and theoretical computer science [Dwork et al., 2021]. Within the philosophy of science, puzzles related to individual probability have been closely identified with "the reference class problem" [Hájek, 2007]. This is a close cousin of a concern that has recently arisen in the context of fairness in machine learning called the "predictive multiplicity problem" (a focal subset of "model multiplicity problems") [Marx et al., 2020, Black et al., 2022], which Breiman [2001] earlier called the "Rashomon Effect". At the core of both of these problems is the fact that from a data sample that is much smaller than the data universe (i.e. the set of all possible observations), we will have observed at most one individual with a particular set of characteristics, and at most one outcome for the event that an "individual probability" speaks to: it will either rain tomorrow or it will not; Alice will either die within the next year or she will not; etc. We do not have the luxury of observing a large number of repetitions and taking averages.
Dawid [2017] lays out two broad classes of perspectives on individual probabilities: the group-to-individual perspective and the individual-to-group perspective.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Greenland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Banking & Finance > Insurance (0.88)
- Health & Medicine > Therapeutic Area (0.86)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.54)
Interpretable ML for Imbalanced Data
Dablain, Damien A., Bellinger, Colin, Krawczyk, Bartosz, Aha, David W., Chawla, Nitesh V.
Deep learning models are being increasingly applied to imbalanced data in high-stakes fields such as medicine, autonomous driving, and intelligence analysis. Imbalanced data compounds the black-box nature of deep networks because the relationships between classes may be highly skewed and unclear. This can reduce trust by model users and hamper the progress of developers of imbalanced learning algorithms. Existing methods that investigate imbalanced data complexity are geared toward binary classification, shallow learning models and low-dimensional data. In addition, current eXplainable Artificial Intelligence (XAI) techniques mainly focus on converting opaque deep learning models into simpler models (e.g., decision trees) or mapping predictions for specific instances to inputs, instead of examining global data properties and complexities. Therefore, there is a need for a framework that is tailored to modern deep networks, that incorporates large, high-dimensional, multi-class datasets, and uncovers data complexities commonly found in imbalanced data (e.g., class overlap, sub-concepts, and outlier instances). We propose a set of techniques that can be used by both deep learning model users to identify, visualize and understand class prototypes, sub-concepts and outlier instances; and by imbalanced learning algorithm developers to detect features and class exemplars that are key to model performance. Our framework also identifies instances that reside on the border of class decision boundaries, which can carry highly discriminative information. Unlike many existing XAI techniques which map model decisions to gray-scale pixel locations, we use saliency through back-propagation to identify and aggregate image color bands across entire classes. Our framework is publicly available at \url{https://github.com/dd1github/XAI_for_Imbalanced_Learning}
- North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
- North America > United States > Virginia > Richmond (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Health & Medicine (1.00)
- Transportation > Air (0.88)
- Transportation > Ground > Road (0.48)
- Government > Regional Government > North America Government > United States Government (0.46)
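The class-level color-band saliency aggregation mentioned at the end of the abstract can be illustrated with a toy model. A linear scorer stands in for a deep network (its input gradient is exact, mimicking what back-propagation computes), and the data, layout, and gradient-times-input variant below are illustrative assumptions, not the paper's framework.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny synthetic "images": 8x8 RGB in channel-planar layout (all red
# pixels, then green, then blue), flattened. Class 1 carries its signal
# in the red band; this is illustrative data, not the paper's benchmarks.
H = W = 8
n = 2000
X = rng.normal(0, 1, (n, H * W * 3))
red = slice(0, H * W)
y = (X[:, red].mean(axis=1) > 0).astype(float)

# Linear scoring model as a stand-in for a deep network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def class_band_saliency(X, y, w, cls):
    """Mean |gradient x input| per color band over one class's images."""
    sal = np.abs(X[y == cls] * w)      # gradient of w @ x w.r.t. x is w
    per_pixel = sal.mean(axis=0)
    bands = per_pixel.reshape(3, H * W).sum(axis=1)
    return dict(zip("RGB", bands))

saliency = class_band_saliency(X, y, w, cls=1.0)
```

Aggregating saliency across all images of a class, rather than explaining one instance, is what distinguishes this global view from per-instance attribution maps.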
Real-Time Detection of Anomalies in Large-Scale Transient Surveys
Muthukrishna, Daniel, Mandel, Kaisey S., Lochner, Michelle, Webb, Sara, Narayan, Gautham
New time-domain surveys, such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), will observe millions of transient alerts each night, making standard approaches of visually identifying new and interesting transients infeasible. We present two novel methods of automatically detecting anomalous transient light curves in real time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first modelling approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We demonstrate our methods' ability to provide anomaly scores as a function of time on light curves from the Zwicky Transient Facility. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model. The parametric model is able to identify anomalies with respect to common supernova classes with high precision and recall scores, achieving area under the precision-recall curves (AUCPR) above 0.79 for most rare classes such as kilonovae, tidal disruption events, intermediate luminosity transients, and pair-instability supernovae. Our ability to identify anomalies improves over the lifetime of the light curves. Our framework, used in conjunction with transient classifiers, will enable fast and prioritised follow-up of unusual transients from new large-scale surveys.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Oceania > Australia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
- Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
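The core idea of the abstract above, score deviations from a model fitted to the known population, can be sketched with a toy parametric template. The rise-and-decay shape, noise level, and chi-square scoring below are illustrative assumptions, not the paper's Bayesian parametric model.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 50, 60)

def template(t, t0=10.0, tau=12.0):
    """A simple rise-and-decay transient shape (illustrative only)."""
    f = np.exp(-(t - t0) / tau) / (1 + np.exp(-(t - t0) / 2.0))
    return f / f.max()

shape = template(t)
sigma = 0.05  # assumed per-point photometric noise

def anomaly_score(flux):
    """Chi-square per point of the best amplitude-scaled template fit:
    large deviations from the population model flag anomalies."""
    amp = (flux @ shape) / (shape @ shape)   # linear least-squares amplitude
    resid = flux - amp * shape
    return np.mean((resid / sigma) ** 2)

# A light curve drawn from the population model scores near 1; a
# differently shaped transient scores far higher.
normal = 1.0 * shape + rng.normal(0, sigma, t.size)
weird = np.exp(-0.5 * ((t - 25) / 3.0) ** 2) + rng.normal(0, sigma, t.size)

score_normal = anomaly_score(normal)
score_weird = anomaly_score(weird)
```

Because the score can be recomputed as each new photometric point arrives, it naturally yields the real-time, per-epoch anomaly scores the abstract describes.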
Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis
The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individual might benefit from an intervention. The majority of this literature has concentrated on the estimation of the effect of treatment on binary outcomes. However, many medical interventions are evaluated in terms of their effect on future events, which are subject to loss to follow-up. In this study, we describe a framework for the estimation of heterogeneous treatment effect in terms of differences in time-to-event (survival) probabilities. We divide the problem into three phases: (1) estimation of treatment effect conditioned on unique sets of the covariate vector; (2) identification of features important for heterogeneity using an ensemble of non-parametric variable importance methods; and (3) estimation of treatment effect on the reference classes defined by the previously selected features, using one-step Targeted Maximum Likelihood Estimation. We conducted a series of simulation studies and found that this method performs well when either sample size or event rate is high enough and the number of covariates contributing to the effect heterogeneity is moderate. An application of this method to a clinical case study was conducted by estimating the effect of oral anticoagulants on newly diagnosed non-valvular atrial fibrillation patients using data from the UK Clinical Practice Research Datalink.
- Europe > United Kingdom (0.14)
- Oceania > Australia > New South Wales (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Research Report > Strength Medium (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- (2 more...)
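Phase (3) of the framework above, estimating treatment effects as survival-probability differences within reference classes, can be illustrated with a plain Kaplan-Meier contrast (a simplification of the paper's Targeted Maximum Likelihood Estimation). The synthetic cohort and parameters below are illustrative assumptions, not CPRD data.

```python
import numpy as np

def km_survival(time, event, horizon):
    """Kaplan-Meier survival probability at `horizon`
    (event=1: event observed; event=0: censored)."""
    s = 1.0
    for u in np.unique(time[event == 1]):
        if u > horizon:
            break
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & (event == 1))
        s *= 1 - deaths / at_risk
    return s

rng = np.random.default_rng(3)
n = 4000

# Synthetic cohort: treatment triples time-to-event in the high-risk
# stratum and does nothing in the low-risk one (effect heterogeneity).
stratum = rng.integers(0, 2, n)        # reference class: 0 = low, 1 = high risk
treated = rng.integers(0, 2, n)
scale = np.where(stratum == 1, 5.0, 15.0) * (1 + 2.0 * treated * stratum)
raw_time = rng.exponential(scale)
censor = rng.exponential(20.0, n)      # loss to follow-up
event = (raw_time <= censor).astype(int)
time = np.minimum(raw_time, censor)

def cate_survival(stratum_value, horizon=10.0):
    """Treated-minus-control survival difference at `horizon`
    within one reference class."""
    m = stratum == stratum_value
    s1 = km_survival(time[m & (treated == 1)], event[m & (treated == 1)], horizon)
    s0 = km_survival(time[m & (treated == 0)], event[m & (treated == 0)], horizon)
    return s1 - s0

effect_high = cate_survival(1)
effect_low = cate_survival(0)
```

The stratified contrast recovers the built-in heterogeneity: a substantial survival benefit in the high-risk class and roughly none in the low-risk class.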
Conditional Sparse $\ell_p$-norm Regression With Optimal Probability
Hainline, John, Juba, Brendan, Le, Hai S., Woodruff, David
We consider the following conditional linear regression problem: the task is to identify both (i) a $k$-DNF condition $c$ and (ii) a linear rule $f$ such that the probability of $c$ is (approximately) at least some given bound $\mu$, and $f$ minimizes the $\ell_p$ loss of predicting the target $z$ in the distribution of examples conditioned on $c$. Thus, the task is to identify a portion of the distribution on which a linear rule can provide a good fit. Algorithms for this task are useful in cases where simple, learnable rules only accurately model portions of the distribution. The prior state-of-the-art for such algorithms could only guarantee finding a condition of probability $\Omega(\mu/n^k)$ when a condition of probability $\mu$ exists, and achieved an $O(n^k)$-approximation to the target loss, where $n$ is the number of Boolean attributes. Here, we give efficient algorithms for solving this task with a condition $c$ that nearly matches the probability of the ideal condition, while also improving the approximation to the target loss. We also give an algorithm for finding a $k$-DNF reference class for prediction at a given query point, that obtains a sparse regression fit that has loss within $O(n^k)$ of optimal among all sparse regression parameters and sufficiently large $k$-DNF reference classes containing the query point.
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.06)
- Asia > Afghanistan > Parwan Province > Charikar (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (4 more...)
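The conditional regression task in the abstract above, find a condition c and a linear rule f that fits well on the sub-distribution where c holds, can be sketched by brute force over small conjunctions. Searching single conjunctions of up to k literals is a simplification of full k-DNF search (a disjunction of such terms), and the data below is illustrative.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
n, d = 3000, 6

# Boolean attributes and a target that is linear in x only where
# attributes 0 AND 1 both hold; elsewhere it is high-variance noise.
B = rng.integers(0, 2, (n, d))
x = rng.normal(0, 1, n)
z = np.where((B[:, 0] == 1) & (B[:, 1] == 1),
             2.0 * x + 1.0,
             rng.normal(0, 3.0, n)) + rng.normal(0, 0.1, n)

def conditional_regression(B, x, z, mu=0.2, k=2):
    """Search conjunctions of up to k literals for a condition with
    empirical probability >= mu minimizing conditioned squared loss."""
    best = (np.inf, None, None)
    for size in range(1, k + 1):
        for idx in combinations(range(B.shape[1]), size):
            mask = B[:, list(idx)].all(axis=1)
            if mask.mean() < mu:          # enforce the probability bound mu
                continue
            A = np.column_stack([x[mask], np.ones(mask.sum())])
            coef, *_ = np.linalg.lstsq(A, z[mask], rcond=None)
            loss = np.mean((A @ coef - z[mask]) ** 2)
            if loss < best[0]:
                best = (loss, idx, coef)
    return best

loss, condition, coef = conditional_regression(B, x, z)
```

The exhaustive search recovers the planted condition and its linear rule; the paper's contribution is doing this efficiently, with near-optimal condition probability and loss guarantees, where brute force over all k-DNFs would be infeasible.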